A stochastic spectral analysis of transcriptional regulatory cascades 
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The past decade has seen great advances in our understanding of the role of noise in gene regulation 
and the physical limits to signaling in biological networks. Here we introduce the spectral method for 
computation of the joint probability distribution over all species in a biological network. The spectral 
method exploits the natural eigenfunctions of the master equation of birth-death processes to solve 
for the joint distribution of modules within the network, which then inform each other and facilitate 
calculation of the entire joint distribution. We illustrate the method on a ubiquitous case in nature: 
linear regulatory cascades. The efficiency of the method makes possible numerical optimization 
of the input and regulatory parameters, revealing design properties of, e.g., the most informative 
cascades. We find, for threshold regulation, that a cascade of strong regulations converts a unimodal 
input to a bimodal output, that multimodal inputs are no more informative than bimodal inputs, 
and that a chain of up-regulations outperforms a chain of down-regulations. We anticipate that this 
numerical approach may be useful for modeling noise in a variety of small network topologies in 
biology. 



Transcriptional regulatory networks are composed of 
genes and proteins, which are often present in small num- 
bers in the cell [TJ [5], rendering deterministic models 
poor descriptions of the counts of protein molecules ob- 
served experimentally [3j IU [9]. Probabilis- 
tic approaches have proven necessary to account fully for 
the variability of molecule numbers within a homogenous 
population of cells. A full stochastic description of even a 
small regulatory network proves quite challenging. Many 
efforts have been made to refine simulation approaches 
[TO] IT!] 112] 113] H4] , which are mainly based on the vary- 
ing step Monte Carlo or 'Gillespie' method [THIH!]- Yet 
expanding full molecular simulations to larger systems 
and scanning parameter space is computationally expen- 
sive. On the other hand the interaction of many protein 
and gene types makes analytical methods hard to im- 
plement. A wide class of approximations to the master 
equation, which describes the evolution of the probabil- 
ity distribution, focuses on limits of large concentrations 
or small switches [TTJ, [TBI US]- Approximations based 
on timescale separation of the steps of small signaling 
cascades have been successfully used to calculate escape 
properties [2711 |2"T] |2"2] . 

In this paper we introduce a new method for calculat- 
ing the steady-state distributions of chemical reactants. 
The procedure, which we call the spectral method, re- 
lies on exploiting the natural basis of a simpler problem 
from the same class. The full problem is then solved nu- 
merically as a an expansion in this basis, reducing the 
master equation to a set of linear algebraic equations. 
We break up the problem into two parts: a preprocess- 



ing step, which can be solved algorithmically; and the 
parameter-specific step of obtaining the actual probabil- 
ity distributions. The spectral method allows for huge 
computational gains with respect to simulations. 

We illustrate the spectral method for the case of regula- 
tory cascades: downstream genes responding to concen- 
trations of transcription factors produced by upstream 
genes which are linked to external cues. Cascades play 
an important role in a diversity of cellular processes 
[23l l24l [25], from decision making in development [26] 
to quorum sensing among cells [27]. We take a coarse- 
grained approach, modeling each step of a cascade with 
a general regulatory function that depends on the copy 
number of the reactant at the previous step (cf. Fig. [TJ . 
While the method as implemented describes arbitrary 
regulation functions, we optimize the information trans- 
mission in the case of the most biologically simple reg- 
ulation function: a discontinuous threshold, in which a 
species is created at a high or low rate depending on the 
copy count of the species directly upstream. In the next 
sections, we outline the spectral method and present in 
detail our findings regarding signaling cascades. 



METHOD 

We calculate the steady-state joint distribution for L 
chemical species in a cascade (cf . Fig. [TJ . The approach 
we take involves two key observations: the master equa- 
tion, being linear. [40] benefits from solution in terms of 
its eigenfunctions; and the behavior of a given species 



2 



a 

B 

/ \ 

A D 

V / 

C 



o 



•©- 



© 



B' 

/ \ 

A' D' 

V /■ 

c 



©... 



0... 



FIG. 1: A schematic representation of a general signaling 
cascade. Interactions between species of interest may include 
intermediate processes; we take a coarse-grained approach, 
condensing these intermediate processes into a single effective 
regulatory function. For example, the regulatory function q n 
describes the creation rate of a species with copy count m as 
a function of the copy count n of the previous species. 



should depend only weakly on distant nodes given the 
proximal nodes. 

The second of these observations can be illustrated suc- 
cinctly by considering a three-gene cascade in which the 
first may be eliminated by marginalization. For three 
species obeying s n -^-> to as in Fig. [I] we have the 
linear master equation 

Psnm = P [gP(s-l)nm ~ 9Psnm + (s + l)P( s +l)nm ~ SPsnm\ 
+ ( ?sPs(n-l)m — QsPsnm + { n + ^)Ps(n+l)m ~ n Psnm 
+P [qnPsn(m-l) ~ q-nPsnm + (™ + l)Psn(m+l) ~ mp snm ] .(1) 

Here time is rescaled by the first gene's degradation rate, 
so that each gene's creation rate {g, q Sl or q n ) is normal- 
ized by its respective degradation rate; p and p are the 
ratios of the first and third gene's degradation rate to the 
second's, respectively. 

To integrate out the first species, we sum over s. We 
then introduce <?„, the effective regulation of n, by 



preserved. Armed with the Markovian approximation 
the equation for the remaining two species simplifies to 

Prim — 9n—lP(n—l)m QriPnrn ~t~ (ri ~T" l)_P(n+l)m ^Pnm 
+P [QnPn(rn-l) ~ 9nPn m + {m + l)p n (m+l) - mp nm ] .(3) 

This procedure can be extended indefinitely for a cascade 
of arbitrary length L, in which modules consisting of pairs 
of adjacent species are each described by two-dimensional 
master equations. 

The distribution for the first two species is obtained by 
summing over all other species, which gives an equation 
of the same form as Eqn. [3] but with g n = g = constant. 
If instead the input distribution is an arbitrary p n , the 
distribution for the first two species is still described by 
Eqn. [3j but with g n calculated recursively from p n via 
9n = (-np n +g„-iPn~i + (n+l)p n+ i)/p n with 50 = Pi/po 
to initialize. jH] Describing the start of a cascade (with 
arbitrary input distribution) and describing subsequent 
steps both amount to solving Eqn. [3] with g n given by 
either the recursive equation or Eqn. [2] respectively. 

We solve Eqn. [3] by defining the generating func- 
tion US] G(x,y) = J2 n , m PnmX n y m over complex vari- 
ables x and y.[42 It will prove more convenient to 
write the generating function in a state space as \G) = 
Y,n m Pnm\n, to), [43] with inverse transform 

Pnm — 

(n, m\G), where the states \i) and for i € {n, m}, 
along with the inner product — Sw, define the pro- 
tein number basis. With these definitions, Eqn. [3] at 
steady-state becomes = H\G), where 



H = b+b n (n) + pb+b m (n). 



(4) 
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Here we have made the Markovian approximation that 
s is conditionally independent of m given n. Gener- 
ally speaking, the probability distribution depends on all 
steps of the cascade. However since there are no loops in 
the cascades we consider here, we assume in Eqn. [2] that 
at steady-state each species is not affected by species two 
or more steps downstream in the cascade. The validity of 
the Markovian approximation is tested using both a non- 
Markovian tensor implementation of the spectral method 
and a stochastic simulation using the Gillespie algorithm 
[16j . as discussed in Supplementary Material. We find 
that the approximation produces accurate results for all 
but the most strongly discontinuous regulation functions; 
even in these cases qualitative features such as modality 
of the output distribution and locations of the modes are 



Here we have introduced raising and lowering operators 
in protein space [251 EM [El 131] obeying bf\i) = \i+l) — 
\i) for i € {rt,m}, b~(n)\n) = n\n— 1) — g n \n) and 
&~ (n)|n, to) = m\n, to — 1) — q n \n, m) . [44] and the regu- 
lation functions have become operators obeying g n \n) — 
g n \n) and q n \n) = q n \n). 

Were the operators b~{n) and b^n) not rt-dependent, 
H would be easily diagonalizable. In fact, this corre- 
sponds to the uncoupled case, in which there is no regu- 
lation, and both upstream and downstream gene undergo 
independent birth-death processes with Poisson steady- 
state distributions. We exploit this fact by working with 
the respective deviations of g n and q n from some con- 
stant creation rates g and q. Then H can be partitioned 
as H = Hq + H i , where 



H = b+b n + pb^b m 
Hi = b+f n + pb+A n , 



(5) 
(6) 



and we define new operators b n \n) = n\n— 1) — g\n), 
b~\m) = m\m - l)-q\m), f„ = g-g n , and A„ = q-q n . 
T n and A n capture the respective deviations of g n and q n 
from g and q, and Ho is diagonal in the eigenbases \ j) and 
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\k) of the uncoupled birth-death processes at rates g and 
q respectively; [45] specifically H \j, k) = (j + pk) \j, k) . 
Projecting Eqn. [3] onto the eigenbasis yields the linear 
equation of motion 



fc-i 



0, (7) 



where G jk = (j,k\G), T n , = J2 n (g - g n )(j\n)(n\j'), and 
Ajj' = J2 n (l~1n)(j\n)(n\j'). Eqn. [7] exploits the subdi- 
agonal nature of the fc-dependence; it is initialized using 
Qjo _ ^ p n (j\n), then solved exactly by matrix in- 
version for each subsequent k. The joint distribution is 
retrieved via the inverse transform as 



Pn 



jk 



[n\j)GP k (m\k). 



(8) 



One computational advantage is that the overlap in- 
tegrals (n\j) and (j\n) need only be evaluated explicitly 
for (n\j = 0) = e-9g n /n\ and (j\n = 0) = (-g) j /j\; all 
other values can be obtained recursively using the selec- 
tion rules (n\j + 1) = (n — l\j) — (n\j) and (j\n + 1) = 
(j — l\n) + (i|n).@B] The same holds for (m\k), taking 
n — > to, j — > k, and g — > q. Note that once g and g have 
been chosen, [17] the calculation can be separated into a 
preprocessing step, in which the matrices (n\j), (j\n), 
and (m\k) are calculated (and potentially reused at sub- 
sequent steps of the cascade or for subsequent steps in 
an optimization), and the actual step of calculating G jfc 
via Eqn. [JJ 

By exploiting the basis of the uncoupled system, we 
have reduced Eqn. [3] to a set of simple linear algebraic 
equations. |48] Eqn.[7j which dramatically speeds up the 
calculation without sacrificing accuracy (cf. Results and 
Supplementary Material). The method is applicable for 
any input function g n and regulation function q n . So- 
lutions using other bases and further generalizations to 
systems with feedback will be discussed in future work. 



RESULTS 

The spectral method is fast and accurate 

To demonstrate [49] the accuracy and computational 
efficiency of the spectral method, we compare it both 
to an iterative numerical solution of Eqn. [3] and to a 
stochastic simulation using the 'Gillespie' algorithm [16 
for a cascade of length L — 2 with a Poisson input (<?„ = 
g = constant) and the discontinuous threshold regulation 
function 



<7_ for n < uq 
q + for n > hq. 



(9) 



The spectral method achieves an agreement up to ma- 
chine precision with the iterative method in ^O.Ols, 



which is ^1000 times faster than the iterative method's 
runtime and ^10 8 faster than the runtime necessary for 
the stochastic simulation to achieve the same accuracy; 
see Supplementary Materials for details. The huge gain in 
computational efficiency over both the iterative method 
and the stochastic simulation makes the spectral method 
extremely useful, particularly for optimization problems, 
in which the probability function must be evaluated mul- 
tiple times. In the next sections we exploit this feature to 
optimize information transmission in signaling cascades. 



Information processing in signaling cascades 

Linear signaling cascades are a ubiquitous feature 
of biological networks, used to transmit relevant infor- 
mation from one part of a cellular system to another 
[23] l24l [25] 126} [27] . Information processing in a cas- 
cade is quantified by the mutual information |33j . which 
measures in bits how much information about an input 
signal is transmitted to the output signal in a noisy pro- 
cess. For a cascade of length L, the mutual informa- 
tion between an input species (with copy number n%) 
and an output species (with copy number ni)[SU] is 
1 = 12 nunL P( n i'"i) lo g2[p( n i>"-L) /p{ni)p(n L )]. In this 
study we define the capacity /* as the maximum mutual 
information over either regulatory parameters, the input 
distribution, or both. Depending on the signal to noise 
ratio, a high-capacity cascade functions either where the 
input signal is strongest or where the transmission pro- 
cess is least noisy [M] [5S] . 

We first consider a cascade of length L in which the 
regulation function q n is a simple threshold (Eqn.|9| with 
fixed parameters that are identical for each cascade step. 
It is worth noting that while a threshold-regulated cre- 
ation rate represents the simplest choice biologically, it 
is the most taxing choice computationally: as the dis- 
continuity A = — g_| increases, we find both that 
(a) a larger cutoff K in eigenmodes is required for a de- 
sired accuracy, and (b) the accuracy of the Markovian ap- 
proximation decreases (cf. Supplementary Material). The 
results herein therefore constitute a stringent numerical 
challenge for the spectral method. 

We take the input p{n{) to be a Poisson distribution 
(i.e. g n = g = constant). In the extreme cases, when the 
threshold is infinite or zero, the output is a Poisson distri- 
bution centered at q_ or q + , respectively. Similarly, when 
the input median is below or above threshold, the output 
mean should be biased toward g_ or q + , respectively. For 
example, in Fig. [2j\, {n\) < no, and the output distri- 
bution is shifted toward This effect is amplified at 
each step of the cascade, such that (til) — ► 9- for large 
L. Similarly, (ni) — > q+ for large L when (m) > no (Fig. 
2p). When (m) ~ n (Fig. [2^3), the output is balanced 
between q^ and q + ; if the discontinuity A = \q + — q_\ is 
sufficiently large, the output is bimodal, as discussed in 
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more detail in the next section. 

Mutual information / decreases monotonically with L 
for all (ni) (cf. Fig. [2J 7 ), as required by the data pro- 
cessing inequality |36j (i.e. one cannot learn more infor- 
mation from the output of an (X+l)-gene cascade than 
one could from an L-gene cascade with identical regula- 
tion, only less). / is maximal for (m) ~ no which makes 
intuitive sense, as it corresponds to the input taking ad- 
vantage of both rates <?_ and q + roughly equally in pro- 
ducing the output. A simple calculation quantifies this 
intuition. Approximating the steady-state distribution 
for the moment as a strict switch conditional on no (i.e. 
v{n L \nx) = P-(n L ) if m < n and p(n L \ni) = p+(n L ) if 
m > no for some distributions p±(ni,)), it follows from 
the definition of / that 



'switch 



± n L 



K±p±{n L ) 



(10) 



where S = -J2± 7T ±log 2 n±, tt_ = E ni <„ D K™i) and 
7r + = J2 ni >n P( n i) = 1 — 7T_. If there is little overlap 
between p_(riL) and p+(nt), then /switch ~ S, which is 
maximal when 7r_ = 7r + , i.e. when the median of the 
input distribution p(n\) lies at the threshold no. Addi- 
tionally, since the maximal value of S (and /switch* since 
the summand of the second term in Eqn. [10] is always 
nonnegative) is 1 bit, this calculation also suggests that 
the capacity of threshold regulation (in the limit of strict 
switch-like behavior) is limited to 1 bit. Again, this result 
is intuitive, as the cascade is only passing on the binary 
information of whether particles in the input distribution 
are below or above the threshold. 

In an experimental setup one might have access to 
only the mean response (or "transfer function"), or the 
variance in response across cells (or "noise"), of a sig- 
naling cascade to its input. Since our method yields 
the full distributions, such summary statistics are read- 
ily computed. Despite the sharpness of threshold regu- 
lation, the transfer functions are quite smooth even at 
L = 2 (cf. Fig. [2J3). The effect of the intrinsic noise is to 
smooth out a sharp discontinuity in creation rates, pro- 
ducing a continuous mean response. The transfer func- 
tions shown are least-squares fit to Hill functions of the 
form {n L \m) = a_ + (a+ - a-)(m) h /[(n 1 ) h + (k d ) h ). 
As one would expect, for all L, best fit values of a_, a + 
are near the rates g_ and q+ respectively, and best fit 
kd values are near the threshold no- As L increases, the 
transfer function sharpens, and cooperativity h increases 
(cf. inset in D), due to the amplified migration of the out- 
put to either q_ and q + in longer cascades (as in Figs. 
|2^and|2p). 



The strength of the noise increases with L (cf. Fig. 
consistent with the reduction in I with L (cf. Fig. 



2 3), 
2 ? ), 



and the noise is peaked at the threshold. The "switch 



approximation" (cf. Eqn. 10 1 illustrates the gain in infor- 
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FIG. 2: Transfer functions and noise in a signaling cascade. 
A-C: Plots of input distribution p(n\) (black) and output 
distributions p(nr,) (colors; see legend) for various cascade 
lengths L. Input distribution is a Poisson centered at (m) = 4 
(in A), (ni) = n = 6 (in B), or (m) = 10 (in C). The 
regulation function q„ for all steps is a threshold (Eqn. |9| 
shown in D (black line with dots), with parameters q_, q+, 
and no overlaid as dashed lines in A-C. The degradation rate 
ratio is p = 1, and Eqn. [3] is solved using the spectral method 
with q = 10 and g = (g n ) for each step in the cascade. D-F: 
Transfer functions (average output {til,}) (D), noise (standard 
deviation of the output cr(ni)) (E), and mutual information 
I (F) as functions of average input (m) for various cascade 
lengths L (colors as in A-C). As in A-C, input is Poisson at 
every (ni); dashed lines correspond to the specific (m) values 
in A, B, and C. Inset in D: Cooperativity feasa function of 
L. 



mation when the median of the input coincides with the 



threshold; the "small noise approximation" [34 , 35j, how- 
ever, illustrates the loss in information when the peak of 
the input coincides with the peak of the noise. The trade- 
off between these two trends thwarts information trans- 
mission with unimodal input distributions (e.g., those 
used in Fig. |2| and suggests an input distribution with 
two or more modes should be able to transduce more in- 
formation. Such distributions are the subject of study 
below, and, in related work [35J |37] are shown to be the 
optimal strategy and to be observed in biology for a regu- 
latory system in which peak noise and threshold coincide. 



Bimodal output from a unimodal input 

A striking feature of Fig.[2|3 is that the unimodal input 
is converted to a bimodal output for cascades of length 
L = 3 or longer. Bimodality can arise from a system with 
two genes whose proteins repress each other or from a sin- 
gle gene whose proteins activate its own expression. Here 
we demonstrate that cascades with sufficiently strong 
regulation constitute an information-optimal mechanism 
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for a cell to achieve bimodality. 

Recall that Fig. [2)3 corresponds to the case where the 
input distribution is optimally matched with the regula- 
tion function, i.e. the bimodal output represents optimal 
information transmission. By optimizing over the mean 
g of a Poisson input distribution, we find that the most 
informative output distribution in a cascade with uni- 
modal input can be unimodal or bimodal, depending on 
regulatory parameters and the length of the cascade. Fig. 
[3j\ shows examples of regulation functions which produce 
output distributions that are unimodal, bimodal for cas- 
cades as long or longer than some L* (which we term 
"persistent" bimodality) , and bimodal for short cascades 
but unimodal at both initial and final nodes for longer 
cascades (which we term "localized" bimodality). 

Bimodality is found both in cascades in which each step 
is down-regulating, which we call "AC" cascades, and in 
those in which each step is up-regulating, which we call 
"DC" cascades. In DC cascades, as seen in the insets of 
panels 1-3 in Fig. [3|Y, the average output either mono- 
tonically decreases or monotonically increases with L. In 
the former case, since q- < no, the probability that the 
output is below the threshold given that the input is be- 
low threshold is large. Successive such regulations drive 
the probability of being below the threshold towards 1, 
successively decreasing (n) at each step in the cascade. In 
the latter case, since q + > n , the same picture holds, and 
(n) monotonically increases with L. Whether the mono- 
tonically increasing or decreasing behavior is the more 
informative is determined by the relationship among q + , 
g_, and uq. In AC cascades, an analogous picture holds 
but with alternation: 7r_ < 7r + for the even-numbered 
links (cf. Eqn. 10), and the AC condition g_ > q + leads 



to 7r + < 7r_ for the odd- numbered links, as illustrated 
in the insets of panels 4-6 in Fig [3j\.. These behaviors 
motivate the names "AC" and "DC," analogous to al- 
ternating and direct current flow. Performance of AC 
and DC cascades is compared in more detail in the next 
section. 

Fig. [3)3 shows a phase diagram of optimal output 
modality as a function of the rates g_ and q + : bimodality 
is found at high values of the discontinuity A = \q + — g_ | 
(specifically, for A > 11 in AC circuits and A > 12 in 
DC circuits when no = 8). Intuitively, since the weight 
of the output is distributed between g_ and q + for long 
cascades, increasing their separation spreads the weights 
apart and creates bimodal distributions. Furthermore, 
as A increases, the bimodality becomes more robust: it 
goes from localized to persistent, and its onset occurs at 
a smaller cascade length L*. The inset of Fig. [3)3 shows 
that capacity I* also increases with A; cascades with bi- 
modal output therefore have higher capacities than those 
with unimodal output. As A increases, the information 
transmission properties of a regulatory cascade are better 
approximated by simple switch-like regulation (cf. Eqn. 
10 I. In short, summarizing the input distribution by 7r + 



and 7r_ is a more informative summary of the distribu- 
tion as the regulation becomes more discontinuous. 



Channel capacity in AC/DC cascades 

Our setup provides a way to ask quantitatively 
whether a cascade with down-regulating steps (AC) can 
transmit information with more or less fidelity than a cas- 
cade with down-regulating steps (DC). Since a cell must 
expend time and energy to make proteins, a fair com- 
parison between cascade types can only be made when 
the species involved in each type are present in equal 
copy number. As in |38j . we introduce the objective 
function L = I — A(n) where I is mutual information 
and (n) = J2i=i( n e) / L ^ s an average copy count over all 
species in the cascade. Here A represents the metabolic 
cost of making proteins, and optimizing L for different 
values of A allows a comparison of AC and DC capacities 
/* at similar values of (n). 

For both AC and DC cascades, I* increases with (n) as 
more proteins are made available to encode the signal, [51 
and I* decreases [52] with L at all (n) (cf. Fig. [I}\.). 
Both AC and DC capacities converge to an L-dependcnt 
asymptotic value at high copy count, but DC cascades 
attain higher capacities per output protein than AC cas- 
cades. The difference is most pronounced at low copy 
count ((n) < 7), and more pronounced still for longer 
cascades. The difference is easily explained: AC and 
DC cascades of the same length with the same discon- 
tinuity A = \q + — q_\ have the same capacity but have 
different mean numbers of proteins. Recall from Fig. [3)3 
that large A leads to high-capacity, bimodal solutions. 
The difference between AC and DC cascades is in the 
placement of their optimal distributions for a given A. 
We observe that optimal AC cascades tend to exhibit 
(n) > n , while optimal DC cascades tend to exhibit 
(n) < n . Ultimately, this allows DC cascades to achieve 
the same capacity for the same regulation parameters 
(cf. Fig. [3j3 inset), but use fewer proteins. These results 
suggest that DC cascades transmit with higher fidelity 
per protein than AC cascades when protein production 
is costly. 



The most informative input to a threshold is 
bimodal 



If the first species is governed by more than a sim- 
ple birth-death process, the input to a cascade will not 
be a simple Poisson distribution. To investigate the role 
of input multimodality in information transmission, we 
consider inputs defined by a mixture of Poisson distribu- 
tions, p(m) = J2f=i Tie _ff< 5"V n i!j ( witn J2f=i n i = !)• 
As before, we expect information to increase with copy 
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FIG. 3: Optimal output modality in cascades with unimodal 
input. A: Plots of optimal input distribution p*(m) (black) 
and optimal output distributions p*(ul) (colors; see legend) 
for different cascade lengths L (optimal input distributions are 
qualitatively the same; only that for L — 2 is shown), corre- 
sponding to regulation functions q n (identical for each step) 
plotted underneath (p = 1, and solutions used q — 10 and 
g = (<7n)). Mutual information I is optimized as a function of 
the mean g of the Poisson input distribution. Magenta num- 
bers on plots correspond to magenta points in B. Insets show 
plots of average output (ul) vs. cascade length L. In the first 
column of A, the output is always unimodal; in the second 
column, the output is bimodal for cascade lengths L > L* for 
some L* ("persistent" bimodality); in the third column, the 
output is bimodal for a range of L values, then unimodal once 
more for large L ( "localized" bimodality). The first row shows 
"DC" cascades, in which each step is up-regulating, and the 
second row shows "AC" cascades, in which each step is down- 
regulating. B: Phase diagram of optimal output modality as 
a function of q~ and q+ (no = 8). White is unimodal, and 
color is bimodal, with color corresponding to L* . Distinction 
between persistent (no 'x') and localized ('x') bimodality is 
shown up to L — 10. Dashed line separates DC cascades from 
AC cascades. Inset: Capacity I* in bits as a function of q_ 
and q+ for the same data. 



number, and we use the objective function L when opti- 
mizing over the input distribution p{n\). 

All optimal input distributions with Z > 2 are bi- 
modal, with one mode on either side of the threshold (cf. 
Fig.Efe). When Z is 3 or more, either all but two 7Tj values 
are driven by the optimization to 0, or all the g± values 
with nonzero weights are driven to one of two unique 
values. The two modes are roughly equally weighted (i.e. 

~ 7T2 ~ 0-5), consistent once more with our calcula- 
tion that the median of the optimal input distribution 
falls roughly at the threshold (cf. Eqn. 10). As an addi- 
tional verification, a plot of capacity I* vs. Z in the inset 
of Fig. [4}3 reveals that I* remains constant for Z = 2 
and beyond. A threshold regulation function presents a 
binary choice, and the optimal input is a bimodal distri- 
bution that equally utilizes both sides. Lastly, we point 
out that the capacities in the inset of Fig. [4j3 are below 1 
bit. Even with a bimodal input distribution, a short cas- 
cade (L = 2), and strong regulation functions (i.e. large 
discontinuities A), we do not find capacities above 1 bit. 
This is consistent with our calculation under the approx- 
imation of threshold regulation as a switch (cf. Eqn. 10 1 



and with the intuition that a threshold represents a bi- 
nary decision. 



CONCLUSIONS 

We have introduced a method, the spectral method, 
which exploits the linear algebraic structure of the mas- 
ter equation, and expands the full problem in terms of its 
natural eigenfunctions. We have illustrated our method 
by probing the optimal transmission properties of signal- 
ing cascades with threshold regulation. We have shown 
that sufficiently long cascades with sufficiently strong reg- 
ulation functions optimally convert a unimodal input to a 
bimodal output. A bimodal input is optimal for informa- 
tion transmission across a threshold, and a multimodal 
input offers no further processing power. Sustained bi- 
modality of the input distribution requires large disconti- 
nuities A between the production rates below and above 
the threshold. The value of A controls the maximum in- 
formation transmitted by a cascade with threshold reg- 
ulation in a similar way for cascades of up-regulations 
(DC) and cascades of down-regulations (AC), but a DC 
cascade outperforms an AC cascade by using fewer av- 
erage copies of its species. We emphasize that the ap- 
plication of the spectral method to signaling cascades 
represents only a beginning. Variations on the natural 
bases in which to expand, and extensions of the method 
to other small network topologies, will be the subject of 
future work. More generally, however, we anticipate that 
the method will prove useful in the direct solution of a 
large class of master equations describing a wide variety 
of biological systems. 
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FIG. 4: A: Capacities of AC and DC cascades. Capacity I* 
vs. average copy count (n) (over all species) for AC (circles) 
and DC (plus signs) cascades of different lengths L (color), 
with Poisson input distributions. Results were obtained by 
optimizing the objective function L over all parameters (g_, 
q+, no, and g) for A = 1 x 10 -6 — 3 x 10 -2 (no is constrained 
to be an integer, the regulation function is the same for every 
step, /0 = 1, and solutions used q — 10 and g = {gn))- Lines 
show convex hulls. B: Optimal input distributions with differ- 
ent numbers Z of Poisson distributions. The cascade length 
is L = 2, the degradation rate ratio is p = 1, and the regula- 
tion function q n is plotted in the bottom panel; solutions used 
q — 10 and g — {g n )- The objective function L is optimized 
with A = 10~ 4 over all input parameters gt and 7Tj. Inset 
Capacity /* vs. Z, averaged at each Z over 7 optimizations 
with different initial conditions. 



an iterative numerical solution and to a stochastic sim- 
ulation (using the varying step Monte Carlo or 'Gille- 
spie' method [HJ [16]) of Eqn. |] Fig. |}\ shows the 
agreement among output distributions p m generated by 
the three methods for a cascade of length L = 2 with 
a Poisson input {g n = g = constant) and the thresh- 
old regulation function in Eqn. [9j The spectral method 
achieves accuracy up to machine precision in ^0.01 s, 
which is ^1000 times faster than the iterative method's 
runtime and ^10 8 faster than the runtime necessary for 
the stochastic simulation to achieve the same accuracy 
(cf. Fig. [5j3). As a measure of error we use the Jensen- 
Shannon divergence |39j (a measure in bits between two 
probability distributions) between the distribution p nm 
generated by the iterative method and that generated 
by either the spectral method or the stochastic simula- 
tion. We plot this measure against the runtime of either 
method, scaled by the runtime of the iterative method, 
in Fig. [5k. 



Validity of the Markovian approximation 



For a cascade of length L, we reduce an L-dimensional 
master equation to a set of 2-dimcnsional equations 
by employing the Markov approximation, i.e. that each 
species is conditionally independent of distant nodes 
given proximal nodes (cf. main text). Here we compare 
results under this approximation with those from a solu- 
tion of the full master equation, using both a stochastic 
simulation [HP and a non-Markovian implementation of 
the spectral method. 

The non-Markovian spectral method is implemented 
as follows. The full master equation for the process 

92("l) <?3("2) q L (n L -l) 
n\ > ?1 2 ► . . . > TIL IS 



p(n) = gp(n- e x ) - gp(ft) + (n 1 + l)p(n+ ei) - nip(n) 

L 



e=2 



+(n e + l)p(n + ei) - n e p(n)] 



(11) 



SUPPLEMENTARY MATERIAL 
The spectral method is fast and accurate 

To demonstrate the accuracy and computational effi- 
ciency of the spectral method, we compare it both to 



where time is rescaled by the degradation rate of the 
first species, g is the creation rate of the first species, qi 
is the creation rate of the £th species, creation rates are 
normalized by corresponding degradation rates, pe is the 
ratio of the degradation rate of the £th species to that 
of the first, n = (n x , ... ,7^), and eg represents a 1 in 
n^th direction. Denoting . . . , ji) as the eigenstate of 
species n at constant rates g, q2, ■ ■ ■ , q~L respectively, Eqn. 
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FIG. 5: Demonstration of the spectral method's accuracy and 
efficiency. A: Output distributions p m obtained by solving 
Eqn. [3] with g„ = g = constant using the spectral method 
described in the main text (blue triangles), an iterative nu- 
merical method (red circles), and a stochastic simulation [16] 
(green squares). Also shown is the input p n (a Poisson distri- 
bution with mean g = 8), and the regulation function q n (Eqn. 
[9| with g_ = 1, q+ — 13, and no = 8; the degradation rate 
ratio is p = 1. Spectral solution used q — 10 and mode num- 
ber cutoffs J = K — 50, and all solutions used copy count 
cutoffs N = M = 50. B: Jensen-Shannon divergence |39] 
between p nm obtained using the iterative numerical method 
and that obtained using the spectral method (blue triangles) 
or stochastic simulation (green squares), plotted against the 
latter two methods' respective runtimes. Runtimes are scaled 
by iterative method's runtime, 15.9 s. Spectral method data 
obtained by varying K from 3 to 12, 589; plateau begins at 
K ~ 50. Stochastic simulation data obtained by varying in- 
tegration time from 100 to 2 x 10 7 , in units of the upstream 
gene's reciprocal degradation rate. 
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operator A 
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by iteration, 
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where (!• 
the deviation 

( fl2| is solved 
obtained via inverse transform 



,Jl\G) 



,(12) 



and 



31- 

l\fi 



_!>. Eqn. 
initialized with 
distribution 



is 



p(ni,...,n L ) = ^2 ("ilii> ■ • • { n L\jL)G n 



(13) 



For a cascade of length L = 4, with two different val- 
ues of the discontinuity A = \q + — q_\ (cf. Eqn. |9|, Fig- 
ure [6] compares the marginal distributions calculated un- 
der the Markov approximation (using both the spectral 
method and an iterative numerical solution) with those 
calculated from the full master equation (using both the 
non-Markovian spectral method and a stochastic simula- 
tion). When A is small there is full agreement between 
the Markovian and non-Markovian distributions (cf. Fig. 
[6j\., with A = 4.5); as A grows, the Markovian distribu- 
tions begin to deviate from the non-Markovian distribu- 
tions (cf. Fig. |6|3, with A = 8.5). The deviation is more 
pronouned at higher i, i.e. for species more downstream 
in the cascade. We emphasize, however, that important 
qualitative features of the distributions, such as modal- 
ity and locations of the modes, are retained under the 
approximation . 



A note on degradation rates 

The ratio p of a downstream gene's degradation rate to 
that of an upstream gene is fixed to 1 in all results in the 
main text. Increasing p is a computationally straight- 
forward way to obtain, e.g., a bimodal output, since it 
corresponds to the case in which the downstream gene 
equilibrates more quickly than the upstream gene, such 
that the output is simply a weighted sum of distributions 
peaked at each of the threshold values q- and q + (cf. Eqn. 
[9]). Specifically, in the p — > 00 limit, for the two-gene 
cascade n — ^ m, p m = n-e~ q - q™ /ml + 7r + e~ q +q™ /ml, 
where tt_ = £„<„ P„ and tt + = However 
most degradation rates are dominated by cell division, 
so degradation rate ratios far from than ~1 are unreal- 
istic. Our results demonstrate that even with all species 
operating on the same timescale, relatively short cascades 
with strong enough regulation provide a information- 
optimal mechanism of converting a unimodal signal to 
a bimodal signal. 
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FIG. 6: Marginal distributions from a Markovian and a non- 
Markovian stochastic description of a cascade. A: Marginal 
distributions for the species in a cascade of length L — 4 (col- 
ors indicate order of species in the cascade; see legend), with 
threshold regulation function q n shown in the bottom panel 
(q- — 0.5, q+ — 5, and no = 7). Distributions were obtained 
using the Markov approximation (cf. main text), either via 
the spectral method (dots) or via an iterative numerical so- 
lution (plus signs), and using the full master equation (Eqn. 
Ill, either via the non-Markovian spectral method (circles) 
or via stochastic simulation [16] (squares). B: As A, but with 

q+ = 9. 
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